Biblio
Video surveillance widely installed in public areas poses a significant threat to privacy. This paper proposes a new privacy-preserving method based on a Generalized Random-Grid based Visual Cryptography Scheme (GRG-based VCS). We first separate the foreground from the background in each video frame; the foreground pixels contain the most important information that needs to be protected. Each foreground area is encrypted into two shares using GRG-based VCS. One share is kept as the foreground, and the other is embedded into a randomly selected frame. The content of the foreground can only be recovered when the two shares are combined. Performance evaluation on several surveillance scenarios demonstrates that the proposed method effectively protects sensitive privacy information in surveillance videos.
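The paper applies a generalized random-grid VCS to foreground regions; as a rough illustration of the underlying idea, the classical (2,2) random-grid scheme below encrypts a binary image into two noise-like shares whose superposition reveals the secret. The function names and OR-stacking recovery model are illustrative assumptions, not the paper's exact generalized scheme.

```python
import random

def rg_vcs_encrypt(binary_img, rng=random.Random(7)):
    """(2,2) random-grid VCS sketch: 1 = black, 0 = white."""
    share1 = [[rng.randint(0, 1) for _ in row] for row in binary_img]
    # White pixels copy share1; black pixels take the complement,
    # so each share alone is indistinguishable from random noise.
    share2 = [[s if px == 0 else 1 - s
               for px, s in zip(row, srow)]
              for row, srow in zip(binary_img, share1)]
    return share1, share2

def rg_vcs_stack(share1, share2):
    """Stacking shares (logical OR) reveals the secret: black pixels
    become fully black, white pixels stay black only half the time."""
    return [[a | b for a, b in zip(r1, r2)] for r1, r2 in zip(share1, share2)]
```

Because black secret pixels always stack to black while white ones are black only with probability 1/2, the secret is visible by contrast once the two shares are combined.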
Unmanned Aerial Systems (UAS) have recently raised great privacy concerns, and a practical method of protecting privacy is needed before UAS can be adopted in civilian airspace. This paper examines privacy policies, filtering strategies, and existing techniques, then proposes a novel method based on encrypted video streams and cloud-based privacy servers. In this scheme, all surveillance video is encrypted at the camera and then delivered to a privacy server. The privacy server decrypts the video using a key shared with the camera and filters the images according to the privacy policy specified for the surveyed region. The sanitized video is delivered to the surveillance operator or to any authorized party on the Internet. In a larger system composed of multiple cameras and multiple privacy servers, keys can be distributed using the Kerberos protocol. With this method the privacy policy can be changed on demand in real time, and no costly on-board processing unit is needed. By utilizing cloud-based servers, advanced image processing and new filtering algorithms can be applied immediately without upgrading the camera software. The method is cost-efficient and promotes video sharing among multiple subscribers, which can spur wide adoption.
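The privacy server's filtering step, applying a region-based policy to a decrypted frame, can be sketched minimally as below. The rectangle-masking policy format and the function name are assumptions for illustration; the paper's actual filters may blur, pixelate, or redact.

```python
def apply_privacy_policy(frame, policy_regions):
    """Blank out every pixel inside the policy's private regions.

    frame: 2D list of pixel values; policy_regions: list of
    (row0, col0, row1, col1) rectangles (end-exclusive).
    """
    out = [row[:] for row in frame]  # leave the input frame untouched
    for r0, c0, r1, c1 in policy_regions:
        for r in range(r0, r1):
            for c in range(c0, c1):
                out[r][c] = 0
    return out
```

Because the policy is applied server-side, updating `policy_regions` changes the sanitization in real time without touching the camera.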
Monitoring surveillance video streams is an important task that surveillance operators routinely carry out. The distribution of video surveillance facilities over multiple premises and the mobility of surveillance users require that they be able to view surveillance video seamlessly from their mobile devices. To satisfy this requirement, we propose a cloud-based IPTV (Internet Protocol Television) solution that leverages the power of cloud infrastructure and the benefits of IPTV technology to deliver surveillance video content seamlessly to different client devices anytime and anywhere. The proposed mechanism also supports user-controlled frame-rate adjustment of video streams and sharing of these streams with other users. In this paper, we describe the overall approach and identify the key technical challenges for its practical implementation. Initial experimental results are presented to demonstrate the viability of the proposed cloud-based IPTV surveillance framework over the traditional IPTV surveillance approach.
H.264/advanced video coding surveillance video encoders use the Skip mode specified by the standard to reduce bandwidth. They also use multiple frames as reference for motion-compensated prediction. In this paper, we propose two techniques to reduce the bandwidth and computational cost of static-camera surveillance video encoders without affecting detection and recognition performance. A spatial sampler is proposed to sample pixels that are segmented using a Gaussian mixture model. Modified weight updates are derived for the parameters of the mixture model to reduce floating-point computations. The storage pattern of the parameters in memory is also modified to improve cache performance. Skip selection is performed using the segmentation results of the sampled pixels. The second contribution is a low-cost algorithm to choose the reference frames. The proposed reference frame selection algorithm reduces the cost of coding uncovered background regions. We also study the number of reference frames required to achieve good coding efficiency. Distortion over foreground pixels is measured to quantify the performance of the proposed techniques. Experimental results show bit rate savings of up to 94.5% over methods proposed in the literature on video surveillance data sets. The proposed techniques also provide up to 74.5% reduction in compression complexity without increasing the distortion over the foreground regions in the video sequence.
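The Gaussian-mixture weight update underlying the segmentation follows the standard adaptive form w_k ← (1 − α)·w_k + α·M_k, where M_k is 1 for the matched component and 0 otherwise. The sketch below shows that step in isolation; it is the textbook update, not the paper's modified fixed-point variant.

```python
def update_gmm_weights(weights, matched_idx, alpha=0.01):
    """One weight-update step of a background Gaussian mixture model:
    the matched component gains weight, all others decay, and the
    weights are renormalized to sum to one."""
    updated = [(1 - alpha) * w + (alpha if k == matched_idx else 0.0)
               for k, w in enumerate(weights)]
    total = sum(updated)
    return [w / total for w in updated]
```

Repeating this update lets components that consistently match (the static background) dominate, so foreground pixels are those matching low-weight components.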
This paper presents a human model-based feature extraction method for a video surveillance retrieval system. The proposed method extracts, from a normalized scene, object features such as height, speed, and representative color using a simple human model based on multiple ellipses. Experimental results show that the proposed system can effectively track the movement routes of people of interest, such as a missing child, an absconder, or a suspect, after an event.
Surveillance cameras are in use everywhere, but the videos and images they capture are typically stored without being processed. Many methods have been proposed for tracking and detecting objects in videos, but what is needed is the meaningful, semantic content of these videos; human activity recognition in particular is quite complex. The proposed method, Semantic Content Extraction (SCE), identifies the objects and events present in a video. This model provides a useful methodology for intruder detection systems, describing the behavior and activities performed by an intruder. Constructing an ontology enhances the spatial and temporal relations between the extracted objects and features. The proposed system thus provides an effective way to detect intruders, thieves, and other malpractices happening around us.
This paper presents the relative merits of IR and microwave sensor technology and their combination with a wireless camera for the development of a wall-mounted wireless intrusion detection system, and explains the phases by which intrusion information is collected and sent to the central control station over a wireless mesh network for analysis and processing. These days every protected zone faces numerous security threats, such as trespassing or damage to important equipment. Unwanted intrusion has become a growing problem and has paved the way for newer technologies that detect intrusion accurately. Almost all organizations protect their zones with conventional arrangements: high walls, wire fencing, power fencing, or guards for manual observation. For large areas, manually observing the perimeter is not a viable option. To solve this problem we have developed a wall-mounted wireless fencing system. In this project I took responsibility for studying how the different units could be integrated and how the data collected from them could be processed with the help of the software, which I developed. Intrusion detection constitutes an important field of application for IR- and microwave-based wireless sensor networks. The proposed state-of-the-art wall-mounted wireless intrusion detection system detects intrusion automatically through a multi-level detection mechanism (IR, microwave, active RFID, and camera) and generates multi-level alerts (buzzer, images, segment illumination, SMS, e-mail) to notify security officers and owners, and also illuminates the particular segment where the intrusion occurred. This enables the authority to identify the area of the incident at once and to handle the emergency quickly. IR-based perimeter protection is a proven technology.
However, an IR-based intrusion detection system is not a foolproof solution, since IR may fail in foggy or dusty weather and generate false alarms. We therefore combine this technology with microwave-based intrusion detection, which works satisfactorily in foggy weather. Another significant component of the proposed system is camera-based intrusion detection: some installations require snapshots of the affected location to be captured instantly when an intrusion happens. The intrusion data are transmitted wirelessly to the control station via multi-hop routing (using active RFID or the IEEE 802.15.4 protocol). The control station receives intrusion information in real time and analyzes the data with the intrusion software. It then sends an SMS to the predefined numbers of the respective authority through a GSM modem attached to the control station engine.
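The multi-level fusion logic described above (IR unreliable in fog, microwave as a complement, camera confirmation raising the alert level) can be sketched as a small decision function. The flag names, weather switch, and two-level alert output are illustrative assumptions, not the system's exact rule set.

```python
def intrusion_alert(ir, microwave, camera_motion, foggy=False):
    """Fuse sensor flags into an alert decision.

    In clear weather IR and microwave must agree to suppress false
    alarms; in fog IR is unreliable, so microwave alone suffices.
    Camera-detected motion escalates a confirmed detection.
    Returns None (no alert), "low", or "high".
    """
    detected = microwave if foggy else (ir and microwave)
    if not detected:
        return None
    return "high" if camera_motion else "low"
```

A "high" alert would then trigger the snapshot capture, segment illumination, and SMS/e-mail notifications described in the abstract.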
The availability of sophisticated source attribution techniques raises new concerns about the privacy and anonymity of photographers, activists, and human rights defenders who need to stay anonymous while spreading their images and videos. Recently, the use of seam carving, a content-aware resizing method, has been proposed to anonymize the source camera of images against the well-known photo-response nonuniformity (PRNU)-based source attribution technique. In this paper, we analyze the seam-carving-based source camera anonymization method by determining the limits of its performance under two adversarial models. Our analysis shows that the effectiveness of the deanonymization attacks depends on various factors, including the parameters of the seam-carving method, the strength of the camera's PRNU noise pattern, and the adversary's ability to identify uncarved image blocks in a seam-carved image. Our results show that, in the general case, there should not be many uncarved blocks larger than 50×50 pixels for successful anonymization of the source camera.
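PRNU-based attribution typically correlates an image's noise residual with a camera's fingerprint estimate. As a minimal sketch of that matching step, the normalized cross-correlation below operates on flat lists of residual values; the function name and flat-list representation are assumptions for illustration.

```python
import math

def ncc(residual, fingerprint):
    """Normalized cross-correlation between an image's noise residual
    and a camera's PRNU fingerprint (both flat lists of floats).
    Values near 1 indicate a likely source-camera match."""
    n = len(residual)
    mr = sum(residual) / n
    mf = sum(fingerprint) / n
    num = sum((r - mr) * (f - mf) for r, f in zip(residual, fingerprint))
    den = math.sqrt(sum((r - mr) ** 2 for r in residual)
                    * sum((f - mf) ** 2 for f in fingerprint))
    return num / den if den else 0.0
```

Seam carving anonymizes by geometrically desynchronizing the residual from the fingerprint, driving this correlation toward zero unless the adversary can locate large uncarved blocks to realign.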
In this paper, we focus on detecting both road vehicles and pedestrians, i.e., obstacle detection, and propose a new obstacle detection and classification technique for dynamic backgrounds. Obstacle detection is based on inverse perspective mapping (IPM) and homography; obstacle classification is based on a fuzzy neural network. The estimation of the vanishing point relies on a feature extraction strategy that segments the lane markings of the images by combining histogram-based segmentation with temporal filtering. The vanishing point of each image is then stabilized by temporal filtering over the estimates from previous images, and the IPM image is computed from the stabilized vanishing point. The method exploits the geometric relations between elements in the scene so that obstacles can be detected. The estimated homography of the road plane between successive images is used for image alignment. A new fuzzy decision fusion method with fuzzy attribution for obstacle detection and classification is described; the fuzzy decision function adapts its parameters automatically to achieve a better classification probability. It is shown that the method achieves better classification results.
Mobile and aerial robots used in urban search and rescue (USAR) operations have shown the potential for allowing us to explore, survey and assess collapsed structures effectively at a safe distance. RGB-D cameras, such as the Microsoft Kinect, allow us to capture 3D depth data in addition to RGB images, providing a significantly richer user experience than flat video, which may provide improved situational awareness for first responders. However, the richer data comes at a higher cost in terms of data throughput and computing power requirements. In this paper we consider the problem of live streaming RGB-D data over wired and wireless communication channels, using low-power, embedded computing equipment. When assessing a disaster environment, a range camera is typically mounted on a ground or aerial robot along with the onboard computer system. Ground robots can use both wireless radio and tethers for communications, whereas aerial robots can only use wireless communication. We propose a hybrid lossless and lossy streaming compression format designed specifically for RGB-D data and investigate the feasibility and usefulness of live-streaming this data in disaster situations.
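The hybrid lossy/lossless idea for RGB-D streams can be sketched as "quantize the depth channel, then apply a general-purpose lossless coder to the quantized stream". The functions below are a minimal illustration of that two-stage structure using zlib, assuming depth in millimetres stored as 16-bit values; they are not the paper's actual compression format.

```python
import struct
import zlib

def compress_depth(depth_mm, step_mm=10):
    """Lossy-then-lossless sketch: quantize depth readings to step_mm
    (bounding the error by step_mm), then deflate the quantized
    16-bit stream with zlib."""
    q = [min(d // step_mm, 65535) for d in depth_mm]
    raw = struct.pack("<%dH" % len(q), *q)
    return zlib.compress(raw)

def decompress_depth(blob, n, step_mm=10):
    """Invert the lossless stage and dequantize."""
    q = struct.unpack("<%dH" % n, zlib.decompress(blob))
    return [v * step_mm for v in q]
```

Quantization trades bounded depth error for much better deflate ratios on the smooth regions typical of range images, which matters when an aerial robot only has a wireless link.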
The lack of qualification of a common operating picture (COP) directly impacts the situational awareness of military Command and Control (C2). Since a commander relies on situational awareness information to make decisions about military operations, the COP must be trustworthy and provide accurate information on which decisions can be based. At present, however, there is no definite way to establish the COP's integrity when it is questioned. This paper examines the integrity of the COP and how it can impact situational awareness, and discusses a potential solution on which future research can be based.
In this paper we propose a Twitter sentiment analysis technique that mines opinion polarity about a given topic. Most current semantic sentiment analysis depends on polarity lexicons; however, many key tone words are frequently bipolar. We demonstrate a technique that accommodates the bipolarity of tone words through a context-sensitive tone lexicon learning mechanism, where the context is modeled by the semantic neighborhood of the main target. Performance analysis shows that the ability to contextualize tone word polarity significantly improves accuracy.
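The core lookup behind such a context-sensitive lexicon can be sketched as: resolve a bipolar tone word against its semantic neighborhood first, and fall back to the base polarity lexicon otherwise. The example entries ("unpredictable" is negative for steering but positive for a movie plot) and the lexicon structure are hypothetical illustrations, not the paper's learned model.

```python
BASE_LEXICON = {"unpredictable": -1}               # default polarity
CONTEXT_LEXICON = {("unpredictable", "plot"): +1}  # learned, context-specific

def tone_polarity(word, context_words):
    """Resolve a bipolar tone word using its semantic neighborhood,
    falling back to the base lexicon (0 = unknown/neutral)."""
    for ctx in context_words:
        if (word, ctx) in CONTEXT_LEXICON:
            return CONTEXT_LEXICON[(word, ctx)]
    return BASE_LEXICON.get(word, 0)
```

In the full system the context pairs would be learned from tweets about the target topic rather than hand-entered.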
Spatial-multiplexing cameras have emerged as a promising alternative to classical imaging devices, often enabling acquisition of `more for less'. One popular architecture for spatial multiplexing is the single-pixel camera (SPC), which acquires coded measurements of the scene with pseudo-random spatial masks. Significant theoretical developments over the past few years provide a means for reconstruction of the original imagery from coded measurements at sub-Nyquist sampling rates. Yet accurate reconstruction generally requires high measurement rates and high signal-to-noise ratios. In this paper, we enquire whether one can perform high-level visual inference (e.g., face recognition or action recognition) from compressive cameras without image reconstruction. This is an interesting question since in many practical scenarios our goals extend beyond image reconstruction. However, most inference tasks require non-linear features, and it is not clear how to extract such features directly from compressed measurements. In this paper, we show that one can extract nontrivial correlational features directly, without reconstruction of the imagery. As a specific example, we consider the problem of face recognition beyond the visible spectrum, e.g., in the short-wave infrared (SWIR) region, where pixels are expensive. We base our framework on smashed filters, which suggest that inner products between high-dimensional signals can be computed in the compressive domain to a high degree of accuracy. We collect a new face image dataset of 30 subjects, obtained using an SPC. Using face recognition as an example, we show that one can indeed perform reconstruction-free inference with a very small loss of accuracy at very high compression ratios of 100 and more.
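The smashed-filter premise rests on the fact that random projections approximately preserve inner products: if each measurement row has i.i.d. N(0, 1/m) entries, then E[⟨Φx, Φy⟩] = ⟨x, y⟩. The sketch below, a Monte Carlo check of that property with made-up dimensions, is only a numerical illustration of the principle, not the paper's inference pipeline.

```python
import math
import random

def smashed_inner_product(x, y, m, rng=random.Random(0)):
    """Estimate <x, y> from m compressive measurements: each
    measurement row phi_i has N(0, 1/m) entries, so the sum of
    (phi_i . x)(phi_i . y) over the m rows is unbiased for <x, y>."""
    d = len(x)
    est = 0.0
    for _ in range(m):
        row = [rng.gauss(0.0, 1.0 / math.sqrt(m)) for _ in range(d)]
        px = sum(r * v for r, v in zip(row, x))
        py = sum(r * v for r, v in zip(row, y))
        est += px * py
    return est
```

The estimator's standard deviation shrinks like 1/√m, which is why correlation filters ("matched" filters) can be evaluated directly in the compressive domain with only a small loss of accuracy.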
Smartphone security threats come mostly from privacy disclosure and malicious chargeback software that incurs charges covertly. Such software exploits vulnerabilities of the earlier permission mechanism to attack mobile phones and, moreover, may invoke hardware to spy on users invisibly in the background. Since the existing Android operating system does not let users monitor and audit system resources, a dynamic supervisory mechanism for process behavior based on the Dalvik VM is proposed to solve this problem. The existing Android framework layer and application layer are modified and extended, and special underlying system services are used to realize dynamic supervision of Dalvik VM process behavior. With this mechanism, each process's use of system resources and the behavior of each app process can be monitored and analyzed in real time. It reduces security threats at the system level and identifies which process is using a given system resource. It detects and intercepts behavior before or at the moment it occurs, protecting private information, important data, and sensitive system operations. Extensive experiments have demonstrated the accuracy, effectiveness, and robustness of the approach.
We propose a dense continuous-time tracking and mapping method for RGB-D cameras. We parametrize the camera trajectory using continuous B-splines and optimize the trajectory through dense, direct image alignment. Our method also directly models rolling shutter in both RGB and depth images within the optimization, which improves tracking and reconstruction quality for low-cost CMOS sensors. Using a continuous trajectory representation has a number of advantages over a discrete-time representation (e.g. camera poses at the frame interval). With splines, fewer variables need to be optimized than with a discrete representation, since the trajectory can be represented with fewer control points than frames. Splines also naturally impose smoothness constraints on derivatives of the trajectory estimate. Finally, the continuous trajectory representation makes it possible to compensate for rolling-shutter effects, since a pose estimate is available at any exposure time of an image. Our approach demonstrates superior quality in tracking and reconstruction compared to approaches with discrete-time or global-shutter assumptions.
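The key enabler is that a B-spline yields a trajectory value at any time, not just at frame boundaries. The sketch below evaluates one uniform cubic B-spline segment from four control points, applied per scalar coordinate; real trajectory splines interpolate on the pose manifold (e.g. SE(3)), so this per-axis scalar version is a simplification.

```python
def cubic_bspline(p0, p1, p2, p3, u):
    """Evaluate a uniform cubic B-spline segment at u in [0, 1] from
    four consecutive control points. The four basis functions sum to
    one, so the curve stays in the control points' convex hull and is
    C2-smooth across segments."""
    b0 = (1 - u) ** 3 / 6.0
    b1 = (3 * u**3 - 6 * u**2 + 4) / 6.0
    b2 = (-3 * u**3 + 3 * u**2 + 3 * u + 1) / 6.0
    b3 = u**3 / 6.0
    return b0 * p0 + b1 * p1 + b2 * p2 + b3 * p3
```

Because `u` can be any exposure time within a frame, each scanline of a rolling-shutter image gets its own pose estimate, which is exactly what the compensation relies on.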
A number of blind Image Quality Evaluation Metrics (IQEMs) for Unmanned Aerial Vehicle (UAV) photography are presented. The visible-light camera is now widely used in UAV photography because of its vivid imaging; unfortunately, outdoor environmental light strongly degrades its imaging output. In this paper, to address this problem, we design and reuse a series of blind IQEMs to analyze the imaging quality of UAV applications. Human Visual System (HVS) based IQEMs, including image brightness level, contrast level, noise level, edge blur level, texture intensity level, jitter level, and flicker level, are all considered. Once these IQEMs are computed, they provide a computational reference for subsequent image processing, such as image understanding and recognition. Preliminary image enhancement experiments confirm the correctness and validity of the proposed technique.
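Two of the simplest blind metrics in this family are global brightness (mean intensity) and global contrast (intensity standard deviation). The sketch below shows these two; the exact formulas the paper uses for its seven HVS-based IQEMs are not given in the abstract, so treat these as generic stand-ins.

```python
import math

def brightness_level(img):
    """Mean intensity over a 2D grayscale image: a basic global
    brightness metric that needs no reference image."""
    pixels = [p for row in img for p in row]
    return sum(pixels) / len(pixels)

def contrast_level(img):
    """Intensity standard deviation: a basic global contrast metric."""
    mu = brightness_level(img)
    pixels = [p for row in img for p in row]
    return math.sqrt(sum((p - mu) ** 2 for p in pixels) / len(pixels))
```

Downstream processing can threshold such scores, e.g. skipping recognition on frames whose brightness or contrast falls outside an acceptable range.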
Image sharpness measurement is an important part of many image processing applications. Multiple algorithms for measuring image sharpness have been proposed and evaluated in the past, but they were developed with out-of-focus photographs in mind and do not work as well on images taken with a digital microscope. In this article we show the differences between images taken with digital cameras, images taken with a digital microscope, and artificially blurred images. The conventional sharpness measures are executed on all these categories to measure the differences, and a standard image set taken with a digital microscope is proposed and described to serve as a common baseline for further sharpness measures in the field.
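One conventional sharpness measure of the kind compared in such studies is the variance of the Laplacian response: blur attenuates high frequencies, so sharper images give larger variance. The sketch below uses the 4-neighbour Laplacian on interior pixels of a 2D list; it is one representative baseline, not necessarily among the specific measures the article evaluates.

```python
def laplacian_variance(img):
    """Variance of the 4-neighbour Laplacian over interior pixels of a
    2D grayscale image; larger values indicate a sharper image."""
    h, w = len(img), len(img[0])
    responses = []
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            lap = (img[r - 1][c] + img[r + 1][c]
                   + img[r][c - 1] + img[r][c + 1]
                   - 4 * img[r][c])
            responses.append(lap)
    mu = sum(responses) / len(responses)
    return sum((x - mu) ** 2 for x in responses) / len(responses)
```

A microscope-specific benchmark matters because such measures are tuned to defocus blur statistics, which differ from the noise and shallow depth of field of micrographs.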
The detection of obstacles is a fundamental issue in autonomous navigation, as it is key to collision avoidance. This paper presents a method for the segmentation of general obstacles by stereo vision, with no need for dense disparity maps or assumptions about the scenario. A sparse set of points is selected according to a local spatial condition and then clustered as a function of their neighborhood, disparity values, and a cost associated with the possibility of each point being part of an obstacle. The method was evaluated on hand-labeled images from the KITTI object detection benchmark, and precision and recall metrics were calculated. The quantitative and qualitative results were satisfactory in scenarios with different types of objects.
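The precision and recall metrics used in the evaluation have the standard definitions: precision = TP / (TP + FP) and recall = TP / (TP + FN). A minimal sketch over point sets (a simplifying assumption; the paper evaluates against hand-labeled regions) is:

```python
def precision_recall(predicted, ground_truth):
    """Standard precision/recall over sets of labelled obstacle
    points, e.g. (row, col) tuples. True positives are the points
    that appear in both sets."""
    tp = len(predicted & ground_truth)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(ground_truth) if ground_truth else 0.0
    return precision, recall
```

High precision with low recall would mean the sparse clustering rarely hallucinates obstacles but misses some, and vice versa.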
Salt-and-pepper noise is very common during transmission of images through a noisy channel or due to impairment in a camera sensor module. Various two-stage cascade configurations have been proposed in the literature for noise removal. These methods can remove low-density impulse noise but, in terms of visible performance, are not suited for high-density noise. We propose an efficient method for removal of both high- and low-density impulse noise, based on a novel extension of iterated conditional modes (ICM). It is a cascade configuration of two stages: noise detection and noise removal. The noise detection process combines iterative decision-based approaches, while the noise removal process is based on iterative noisy-pixel estimation. Using the improved approach, images with up to 95% corruption have been recovered with good results, and images with up to 98% corruption with quite satisfactory results. To benchmark image quality, we consider metrics such as PSNR (Peak Signal-to-Noise Ratio), MSE (Mean Square Error), and SSIM (Structural Similarity Index Measure).
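The two-stage cascade structure (detect, then estimate) can be illustrated with a naive baseline: flag extreme-valued pixels as impulse-noise candidates, then replace each flagged pixel with the median of its unflagged 3×3 neighbours. This is a simplified stand-in for the paper's ICM-based iterative estimation, shown only to make the cascade concrete.

```python
def remove_impulse_noise(img):
    """Two-stage sketch for salt-and-pepper removal on an 8-bit 2D
    image: (1) detection flags pixels at the extremes (0 or 255);
    (2) removal replaces each flagged pixel with the median of its
    non-flagged 3x3 neighbours, leaving clean pixels untouched."""
    h, w = len(img), len(img[0])
    noisy = [[img[r][c] in (0, 255) for c in range(w)] for r in range(h)]
    out = [row[:] for row in img]
    for r in range(h):
        for c in range(w):
            if not noisy[r][c]:
                continue
            clean = [img[rr][cc]
                     for rr in range(max(0, r - 1), min(h, r + 2))
                     for cc in range(max(0, c - 1), min(w, c + 2))
                     if not noisy[rr][cc]]
            if clean:
                clean.sort()
                out[r][c] = clean[len(clean) // 2]
    return out
```

At the very high corruption rates the paper targets, most 3×3 windows contain no clean pixels at all, which is why an iterative estimation scheme is needed instead of this single-pass baseline.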
In this paper we present a framework for Quality of Information (QoI)-aware networking. QoI quantifies how useful a piece of information is for a given query or application. Herein, we present a general QoI model, as well as a specific example instantiation that carries through the rest of the paper. In this model, we focus on the tradeoffs between precision and accuracy. As a motivating example, we look at traffic video analysis. We present simple algorithms for deriving various traffic metrics from video, such as vehicle count and average speed. We implement these algorithms on both a desktop workstation and a less-capable mobile device. We then show how QoI-awareness enables end devices to make intelligent decisions about how to process queries and form responses, such that huge bandwidth savings are realized.
This paper describes Smartpig, an algorithm for the iterative mosaicking of images of a planar surface using a unique parameterization which decomposes inter-image projective warps into camera intrinsics, fronto-parallel projections, and inter-image similarities. The constraints resulting from the inter-image alignments within an image set are stored in an undirected graph structure allowing efficient optimization of image projections on the plane. Camera pose is also directly recoverable from the graph, making Smartpig a feasible solution to the problem of simultaneous location and mapping (SLAM). Smartpig is demonstrated on a set of 144 high resolution aerial images and evaluated with a number of metrics against ground control.
We present the novel concept of Controllable Face Privacy. Existing methods that alter face images to conceal identity inadvertently also destroy other facial attributes such as gender, race, or age. This all-or-nothing approach is too harsh. Instead, we propose a flexible method that can independently control the amount of identity alteration while keeping other facial attributes unchanged. To achieve this flexibility, we apply a subspace decomposition to our face encoding scheme, effectively decoupling facial attributes such as gender, race, age, and identity into mutually orthogonal subspaces, which in turn enables independent control of these attributes. Our method is thus useful for nuanced face de-identification, in which only facial identity is altered while other attributes, such as gender, race, and age, are retained. These altered face images protect identity privacy yet allow other computer vision analyses, such as gender detection, to proceed unimpeded. Controllable Face Privacy is therefore useful for reaping the benefits of surveillance cameras while preventing privacy abuse. Our proposal also permits privacy to be applied not just to identity but to other facial attributes as well. Furthermore, privacy-protection mechanisms such as k-anonymity, L-diversity, and t-closeness may be readily incorporated into our method. Extensive experiments with commercial facial analysis software show that our alteration method is indeed effective.
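With the attributes decoupled into mutually orthogonal subspaces, controlling identity amounts to scaling only the encoding's component inside the identity subspace. The sketch below assumes an orthonormal basis for that subspace is given as plain lists; the function name and `strength` parameter are illustrative, not the paper's notation.

```python
def alter_identity(encoding, identity_basis, strength):
    """Scale only the identity component of a face encoding.

    encoding: list of floats; identity_basis: orthonormal vectors
    spanning the identity subspace; strength: 0.0 keeps identity
    intact, 1.0 removes the identity component entirely. Components
    in the orthogonal (gender/race/age) subspaces are untouched.
    """
    altered = encoding[:]
    for b in identity_basis:
        coeff = sum(e * v for e, v in zip(encoding, b))  # projection
        altered = [a - strength * coeff * v for a, v in zip(altered, b)]
    return altered
```

Intermediate `strength` values give the graded, controllable de-identification the abstract describes, rather than the all-or-nothing alternative.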
Keeping a driver focused on the road is one of the most critical steps in ensuring the safe operation of a vehicle. The Strategic Highway Research Program 2 (SHRP2) has over 3,100 recorded videos of volunteer drivers during a period of 2 years. This extensive naturalistic driving study (NDS) contains over one million hours of video and associated data that could aid safety researchers in understanding where the driver's attention is focused. Manual analysis of this data is infeasible; therefore efforts are underway to develop automated feature extraction algorithms to process and characterize the data. The real-world nature, volume, and acquisition conditions are unmatched in the transportation community, but there are also challenges because the data has relatively low resolution, high compression rates, and differing illumination conditions. A smaller dataset, the head pose validation study, is available which used the same recording equipment as SHRP2 but is more easily accessible, with fewer privacy constraints. In this work we report initial head pose accuracy using commercial and open-source face pose estimation algorithms on the head pose validation data set.
With the increasing popularity of wearable devices, information is becoming much more easily available. However, personal information sharing still poses great challenges because of privacy issues. We propose the idea of a Visual Human Signature (VHS), which can represent each person uniquely even when captured in different views and poses by wearable cameras. We evaluate the performance of multiple effective modalities for recognizing an identity, including facial appearance, visual patches, facial attributes, and clothing attributes. We propose emphasizing significant dimensions and using weighted voting fusion to incorporate the modalities and improve VHS recognition. By jointly considering multiple modalities, the VHS recognition rate reaches 51% on frontal images and 48% in the more challenging environment, and our approach surpasses the average-fusion baseline by 25% and 16%, respectively. We also introduce the Multiview Celebrity Identity Dataset (MCID), a new dataset containing hundreds of identities with different views and clothing, for comprehensive evaluation.
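The weighted voting fusion step, combining per-modality identity scores with per-modality reliability weights, can be sketched as below. The dict-based score representation and example weights are assumptions for illustration; the paper additionally learns which feature dimensions to emphasize before fusing.

```python
def weighted_voting_fusion(modality_scores, weights):
    """Fuse per-modality identity scores by weighted voting.

    modality_scores: one dict per modality mapping identity -> score
    (e.g. face, visual patches, facial/clothing attributes);
    weights: per-modality reliability weights. Returns the identity
    with the highest fused score.
    """
    fused = {}
    for scores, w in zip(modality_scores, weights):
        for identity, s in scores.items():
            fused[identity] = fused.get(identity, 0.0) + w * s
    return max(fused, key=fused.get)
```

Tilting the weights toward whichever modality is reliable in the current view (e.g. clothing when the face is not frontal) is what lets the fused signature outperform plain average fusion.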
In order to enhance supply chain security at airports, the German federal ministry of education and research has initiated the project ESECLOG (enhanced security in the air cargo chain), whose goal is to improve threat detection accuracy using one-sided access methods. In this paper, we present a new X-ray backscatter technology for non-intrusive imaging of suspicious objects (mainly low-Z explosives) in luggage and parcels with only single-sided access. A key element of this technology is an X-ray backscatter camera embedded with a special twisted-slit collimator. The developed technology efficiently resolves the problem of imaging the complex interior of an object by fixing the source and object positions and changing only the scanning direction of the X-ray backscatter camera. Experiments were carried out on luggage and parcels packed with mock-up dangerous materials, including liquid and solid explosive simulants. In addition, the quality of the X-ray backscatter image was enhanced by employing high-resolution digital detector arrays. Experimental results are discussed, and the efficiency of the present technique in detecting suspicious objects in luggage and parcels is demonstrated. Finally, important applications of the proposed backscatter imaging technology to aviation security are presented.