Biblio
To exploit high temporal correlations in video frames of the same scene, the current frame is predicted from the already-encoded reference frames using block-based motion estimation and compensation techniques. While this approach can efficiently exploit the translation motion of the moving objects, it is susceptible to other types of affine motion and object occlusion/deocclusion. Recently, deep learning has been used to model the high-level structure of human pose in specific actions from short videos and then generate virtual frames in future time by predicting the pose using a generative adversarial network (GAN). Therefore, modelling the high-level structure of human pose is able to exploit semantic correlation by predicting human actions and determining its trajectory. Video surveillance applications will benefit as stored “big” surveillance data can be compressed by estimating human pose trajectories and generating future frames through semantic correlation. This paper explores a new way of video coding by modelling human pose from the already-encoded frames and using the generated frame at the current time as an additional forward-referencing frame. It is expected that the proposed approach can overcome the limitations of the traditional backward-referencing frames by predicting the blocks containing the moving objects with lower residuals. Our experimental results show that the proposed approach can achieve on average up to 2.83 dB PSNR gain and 25.93% bitrate savings for high motion video sequences compared to standard video coding.
ISSN: 2642-9357
Air-gapped networks achieve security by using the physical isolation to keep the computers and network from the Internet. However, magnetic covert channels based on CPU utilization have been proposed to help secret data to escape the Faraday-cage and the air-gap. Despite the success of such cover channels, they suffer from the high risk of being detected by the transmitter computer and the challenge of installing malware into such a computer. In this paper, we propose MagView, a distributed magnetic cover channel, where sensitive information is embedded in other data such as video and can be transmitted over the air-gapped internal network. When any computer uses the data such as playing the video, the sensitive information will leak through the magnetic covert channel. The "separation" of information embedding and leaking, combined with the fact that the covert channel can be created on any computer, overcomes these limitations. We demonstrate that CPU utilization for video decoding can be effectively controlled by changing the video frame type and reducing the quantization parameter without video quality degradation. We prototype MagView and achieve up to 8.9 bps throughput with BER as low as 0.0057. Experiments under different environment are conducted to show the robustness of MagView. Limitations and possible countermeasures are also discussed.
In our daily lives, the advances of new technology can be used to sustain the development of people across the globe. Particularly, e-government can be the dynamo of the development for the people. The development of technology and the rapid growth in the use of internet creates a big challenge in the administration in both the public and the private sector. E-government is a vital accomplishment, whereas the security is the main downside which occurs in each e-government process. E-government has to be secure as technology grows and the users have to follow the procedures to make their own transactions safe. This paper tackles the challenges and obstacles to enhance the security of information in e-government. Hence to achieve security data hiding techniques are found to be trustworthy. Reversible data hiding (RDH) is an emerging technique which helps in retaining the quality of the cover image. Hence it is preferred over the traditional data hiding techniques. Modification in the existing algorithm is performed for image encryption scheme and data hiding scheme in order to improve the results. To achieve this secret data is split into 20 parts and data concealing is performed on each part. The data hiding procedure includes embedding of data into least significant nibble of the cover image. The bits are further equally distributed in the cover image to obtain the key security parameters. Hence the obtained results validate that the proposed scheme is better than the existing schemes.
Advancements in semiconductor domain gave way to realize numerous applications in Video Surveillance using Computer vision and Deep learning, Video Surveillances in Industrial automation, Security, ADAS, Live traffic analysis etc. through image understanding improves efficiency. Image understanding requires input data with high precision which is dependent on Image resolution and location of camera. The data of interest can be thermal image or live feed coming for various sensors. Composite(CVBS) is a popular video interface capable of streaming upto HD(1920x1080) quality. Unlike high speed serial interfaces like HDMI/MIPI CSI, Analog composite video interface is a single wire standard supporting longer distances. Image understanding requires edge detection and classification for further processing. Sobel filter is one the most used edge detection filter which can be embedded into live stream. This paper proposes Zynq FPGA based system design for video surveillance with Sobel edge detection, where the input Composite video decoded (Analog CVBS input to YCbCr digital output), processed in HW and streamed to HDMI display simultaneously storing in SD memory for later processing. The HW design is scalable for resolutions from VGA to Full HD for 60fps and 4K for 24fps. The system is built on Xilinx ZC702 platform and TVP5146 to showcase the functional path.
In this paper a novel data hiding method has been proposed which is based on Non-Linear Feedback Shift Register and Tinkerbell 2D chaotic map. So far, the major work in Steganography using chaotic map has been confined to image steganography where significant restrictions are there to increase payload. In our work, 2D chaotic map and NLFSR are used to developed a video steganography mechanism where data will be embedded in the segregated frames. This will increase the data hiding limit exponentially. Also, embedding position of each frame will be different from others frames which will increase the overall security of the proposed mechanism. We have achieved this randomized data hiding points by using a chaotic map. Basically, Chaotic theory which is non-linear dynamics physics is using in this era in the field of Cryptography and Steganography and because of this theory, little bit changes in initial condition makes the output totally different. So, it is very hard to get embedding position of data without knowing the initial value of the chaotic map.
Video surveillance has been widely adopted to ensure home security in recent years. Most video encoding standards such as H.264 and MPEG-4 compress the temporal redundancy in a video stream using difference coding, which only encodes the residual image between a frame and its reference frame. Difference coding can efficiently compress a video stream, but it causes side-channel information leakage even though the video stream is encrypted, as reported in this paper. Particularly, we observe that the traffic patterns of an encrypted video stream are different when a user conducts different basic activities of daily living, which must be kept private from third parties as obliged by HIPAA regulations. We also observe that by exploiting this side-channel information leakage, attackers can readily infer a user's basic activities of daily living based on only the traffic size data of an encrypted video stream. We validate such an attack using two off-the-shelf cameras, and the results indicate that the user's basic activities of daily living can be recognized with a high accuracy.
Multimedia transmission in wireless multimedia sensor networks is often energy constraints. In practice the bit rate resulting from all the multimedia digitization formats are substantially larger than the bit rates of transmission channels that are available with the networks associated with these applications. For the purpose of efficient of storage and transmission of the content, the popular compression technique MPEG4/H.264 has been made used. To achieve better coding efficiency video streaming protocols MPEG4/H.264 uses several techniques which is increasing the complexity involved in computation at the encoder prominently for wireless sensor network devices having lesser power abilities. In this paper we propose energy consumption reduction framework for transmission in wireless networks so that well-balanced quality of service (QoS) in multimedia network can be maintained. The experiment result demonstrate that the effectiveness of the proposed approach in energy efficiency in wireless sensor network where the energy is the critical parameter.
This paper investigates several techniques that increase the accuracy of motion boundaries in estimated motion fields of a local dense estimation scheme. In particular, we examine two matching metrics, one is MSE in the image domain and the other one is a recently proposed multiresolution metric that has been shown to produce more accurate motion boundaries. We also examine several different edge-preserving filters. The edge-aware moving average filter, proposed in this paper, takes an input image and the result of an edge detection algorithm, and outputs an image that is smooth except at the detected edges. Compared to the adoption of edge-preserving filters, we find that matching metrics play a more important role in estimating accurate and compressible motion fields. Nevertheless, the proposed filter may provide further improvements in the accuracy of the motion boundaries. These findings can be very useful for a number of recently proposed scalable interactive video coding schemes.
The video streaming between the sender and the receiver involves multiple unsecured hops where the video data can be illegally copied if the nodes run malicious forwarding logic. This paper introduces a novel method to stream video data through dual channels using dual data paths. The frames' pixels are also scrambled. The video frames are divided into two frame streams. At the receiver side video is re-constructed and played for a limited time period. As soon as small chunk of merged video is played, it is deleted from video buffer. The approach has been tried to formalize and initial simulation has been done over MATLAB. Preliminary results are optimistic and a refined approach may lead to a formal designing of network layer routing protocol with corrections in transport layer.
The enormous size of video data of natural scene and objects is a practical threat to storage, transmission. The efficient handling of video data essentially requires compression for economic utilization of storage space, access time and the available network bandwidth of the public channel. In addition, the protection of important video is of utmost importance so as to save it from malicious intervention, attack or alteration by unauthorized users. Therefore, security and privacy has become an important issue. Since from past few years, number of researchers concentrate on how to develop efficient video encryption for secure video transmission, a large number of multimedia encryption schemes have been proposed in the literature like selective encryption, complete encryption and entropy coding based encryption. Among above three kinds of algorithms, they all remain some kind of shortcomings. In this paper, we have proposed a lightweight selective encryption algorithm for video conference which is based on efficient XOR operation and symmetric hierarchical encryption, successfully overcoming the weakness of complete encryption while offering a better security. The proposed algorithm guarantees security, fastness and error tolerance without increasing the video size.
Selective encryption designates a technique that aims at scrambling a message content while preserving its syntax. Such an approach allows encryption to be transparent towards middle-box and/or end user devices, and to easily fit within existing pipelines. In this paper, we propose to apply this property to a real-time diffusion scenario - or broadcast - over a RTP session. The main challenge of such problematic is the preservation of the synchronization between encryption and decryption. Our solution is based on the Advanced Encryption Standard in counter mode which has been modified to fit our auto-synchronization requirement. Setting up the proposed synchronization scheme does not induce any latency, and requires no additional bandwidth in the RTP session (no additional information is sent). Moreover, its parallel structure allows to start decryption on any given frame of the video while leaving a lot of room for further optimization purposes.
Blind Source Separation (BSS) deals with the recovery of source signals from a set of observed mixtures, when little or no knowledge of the mixing process is available. BSS can find an application in the context of network coding, where relaying linear combinations of packets maximizes the throughput and increases the loss immunity. By relieving the nodes from the need to send the combination coefficients, the overhead cost is largely reduced. However, the scaling ambiguity of the technique and the quasi-uniformity of compressed media sources makes it unfit, at its present state, for multimedia transmission. In order to open new practical applications for BSS in the context of multimedia transmission, we have recently proposed to use a non-linear encoding to increase the discriminating power of the classical entropy-based separation methods. Here, we propose to append to each source a non-linear message digest, which offers an overhead smaller than a per-symbol encoding and that can be more easily tuned. Our results prove that our algorithm is able to provide high decoding rates for different media types such as image, audio, and video, when the transmitted messages are less than 1.5 kilobytes, which is typically the case in a realistic transmission scenario.
The exponential growth of surveillance videos presents an unprecedented challenge for high-efficiency surveillance video coding technology. Compared with the existing coding standards that were basically developed for generic videos, surveillance video coding should be designed to make the best use of the special characteristics of surveillance videos (e.g., relative static background). To do so, this paper first conducts two analyses on how to improve the background and foreground prediction efficiencies in surveillance video coding. Following the analysis results, we propose a background-modeling-based adaptive prediction (BMAP) method. In this method, all blocks to be encoded are firstly classified into three categories. Then, according to the category of each block, two novel inter predictions are selectively utilized, namely, the background reference prediction (BRP) that uses the background modeled from the original input frames as the long-term reference and the background difference prediction (BDP) that predicts the current data in the background difference domain. For background blocks, the BRP can effectively improve the prediction efficiency using the higher quality background as the reference; whereas for foreground-background-hybrid blocks, the BDP can provide a better reference after subtracting its background pixels. Experimental results show that the BMAP can achieve at least twice the compression ratio on surveillance videos as AVC (MPEG-4 Advanced Video Coding) high profile, yet with a slightly additional encoding complexity. Moreover, for the foreground coding performance, which is crucial to the subjective quality of moving objects in surveillance videos, BMAP also obtains remarkable gains over several state-of-the-art methods.
H.264/advanced video coding surveillance video encoders use the Skip mode specified by the standard to reduce bandwidth. They also use multiple frames as reference for motion-compensated prediction. In this paper, we propose two techniques to reduce the bandwidth and computational cost of static camera surveillance video encoders without affecting detection and recognition performance. A spatial sampler is proposed to sample pixels that are segmented using a Gaussian mixture model. Modified weight updates are derived for the parameters of the mixture model to reduce floating point computations. A storage pattern of the parameters in memory is also modified to improve cache performance. Skip selection is performed using the segmentation results of the sampled pixels. The second contribution is a low computational cost algorithm to choose the reference frames. The proposed reference frame selection algorithm reduces the cost of coding uncovered background regions. We also study the number of reference frames required to achieve good coding efficiency. Distortion over foreground pixels is measured to quantify the performance of the proposed techniques. Experimental results show bit rate savings of up to 94.5% over methods proposed in literature on video surveillance data sets. The proposed techniques also provide up to 74.5% reduction in compression complexity without increasing the distortion over the foreground regions in the video sequence.
Unmanned Aerial Systems (UAS) have raised a great concern on privacy recently. A practical method to protect privacy is needed for adopting UAS in civilian airspace. This paper examines the privacy policies, filtering strategies, existing techniques, then proposes a novel method based on the encrypted video stream and the cloud-based privacy servers. In this scheme, all video surveillance images are initially encrypted, then delivered to a privacy server. The privacy server decrypts the video using the shared key with the camera, and filters the image according to the privacy policy specified for the surveyed region. The sanitized video is delivered to the surveillance operator or anyone on the Internet who is authorized. In a larger system composed of multiple cameras and multiple privacy servers, the keys can be distributed using Kerberos protocol. With this method the privacy policy can be changed on demand in real-time and there is no need for a costly on-board processing unit. By utilizing the cloud-based servers, advanced image processing algorithms and new filtering algorithms can be applied immediately without upgrading the camera software. This method is cost-efficient and promotes video sharing among multiple subscribers, thus it can spur wide adoption.