Visible to the public Biblio

Filters: Keyword is Computer vision  [Clear All Filters]
2017-03-08
Boykov, Y., Isack, H., Olsson, C., Ayed, I. B..  2015.  Volumetric Bias in Segmentation and Reconstruction: Secrets and Solutions. 2015 IEEE International Conference on Computer Vision (ICCV). :1769–1777.

Many standard optimization methods for segmentation and reconstruction compute ML model estimates for appearance or geometry of segments, e.g. Zhu-Yuille [23], Torr [20], Chan-Vese [6], GrabCut [18], Delong et al. [8]. We observe that the standard likelihood term in these formu-lations corresponds to a generalized probabilistic K-means energy. In learning it is well known that this energy has a strong bias to clusters of equal size [11], which we express as a penalty for KL divergence from a uniform distribution of cardinalities. However, this volumetric bias has been mostly ignored in computer vision. We demonstrate signif- icant artifacts in standard segmentation and reconstruction methods due to this bias. Moreover, we propose binary and multi-label optimization techniques that either (a) remove this bias or (b) replace it by a KL divergence term for any given target volume distribution. Our general ideas apply to continuous or discrete energy formulations in segmenta- tion, stereo, and other reconstruction problems.

2017-03-07
Kannao, Raghvendra, Guha, Prithwijit.  2016.  Generic TV Advertisement Detection Using Progressively Balanced Perceptron Trees. Proceedings of the Tenth Indian Conference on Computer Vision, Graphics and Image Processing. :8:1–8:8.

Automatic detection of TV advertisements is of paramount importance for various media monitoring agencies. Existing works in this domain have mostly focused on news channels using news specific features. Most commercial products use near copy detection algorithms instead of generic advertisement classification. A generic detector needs to handle inter-class and intra-class imbalances present in data due to variability in content aired across channels and frequent repetition of advertisements. Imbalances present in data make classifiers biased towards one of the classes and thus require special treatment. We propose to use tree of perceptrons to solve this problem. The training data available for each perceptron node is balanced using cluster based over-sampling and TOMEK link cleaning as we traverse the tree downwards. The trained perceptron node then passes the original unbalanced data to its children. This process is repeated recursively till we reach the leaf nodes. We call this new algorithm as "Progressively Balanced Perceptron Tree". We have also contributed a TV advertisements dataset consisting of 250 hours of videos recorded from five non-news TV channels of different genres. Experimentations on this dataset have shown that the proposed approach has comparatively superior and balanced performance with respect to six baseline methods. Our proposal generalizes well across channels, with varying training data sizes and achieved a top F1-score of 97% in detecting advertisements.

Mittal, Gaurav, Yagnik, Kaushal B., Garg, Mohit, Krishnan, Narayanan C..  2016.  SpotGarbage: Smartphone App to Detect Garbage Using Deep Learning. Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing. :940–945.

Maintaining a clean and hygienic civic environment is an indispensable yet formidable task, especially in developing countries. With the aim of engaging citizens to track and report on their neighborhoods, this paper presents a novel smartphone app, called SpotGarbage, which detects and coarsely segments garbage regions in a user-clicked geo-tagged image. The app utilizes the proposed deep architecture of fully convolutional networks for detecting garbage in images. The model has been trained on a newly introduced Garbage In Images (GINI) dataset, achieving a mean accuracy of 87.69%. The paper also proposes optimizations in the network architecture resulting in a reduction of 87.9% in memory usage and 96.8% in prediction time with no loss in accuracy, facilitating its usage in resource constrained smartphones.

2017-02-21
X. Li, J. D. Haupt.  2015.  "Outlier identification via randomized adaptive compressive sampling". 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). :3302-3306.

This paper examines the problem of locating outlier columns in a large, otherwise low-rank, matrix. We propose a simple two-step adaptive sensing and inference approach and establish theoretical guarantees for its performance. Our results show that accurate outlier identification is achievable using very few linear summaries of the original data matrix - as few as the squared rank of the low-rank component plus the number of outliers, times constant and logarithmic factors. We demonstrate the performance of our approach experimentally in two stylized applications, one motivated by robust collaborative filtering tasks, and the other by saliency map estimation tasks arising in computer vision and automated surveillance.

2015-05-04
Jing Li, Ming Chen.  2014.  On-Road Multiple Obstacles Detection in Dynamical Background. Intelligent Human-Machine Systems and Cybernetics (IHMSC), 2014 Sixth International Conference on. 1:102-105.

Road In this paper, we focus on both the road vehicle and pedestrians detection, namely obstacle detection. At the same time, a new obstacle detection and classification technique in dynamical background is proposed. Obstacle detection is based on inverse perspective mapping and homography. Obstacle classification is based on fuzzy neural network. The estimation of the vanishing point relies on feature extraction strategy, which segments the lane markings of the images by combining a histogram-based segmentation with temporal filtering. Then, the vanishing point of each image is stabilized by means of a temporal filtering along the estimates of previous images. The IPM image is computed based on the stabilized vanishing point. The method exploits the geometrical relations between the elements in the scene so that obstacle can be detected. The estimated homography of the road plane between successive images is used for image alignment. A new fuzzy decision fusion method with fuzzy attribution for obstacle detection and classification application is described. The fuzzy decision function modifies parameters with auto-adapted algorithm to get better classification probability. It is shown that the method can achieve better classification result.
 

2015-05-01
Yueguo Zhang, Lili Dong, Shenghong Li, Jianhua Li.  2014.  Abnormal crowd behavior detection using interest points. Broadband Multimedia Systems and Broadcasting (BMSB), 2014 IEEE International Symposium on. :1-4.

Abnormal crowd behavior detection is an important research issue in video processing and computer vision. In this paper we introduce a novel method to detect abnormal crowd behaviors in video surveillance based on interest points. A complex network-based algorithm is used to detect interest points and extract the global texture features in scenarios. The performance of the proposed method is evaluated on publicly available datasets. We present a detailed analysis of the characteristics of the crowd behavior in different density crowd scenes. The analysis of crowd behavior features and simulation results are also demonstrated to illustrate the effectiveness of our proposed method.

Rasheed, N., Khan, S.A., Khalid, A..  2014.  Tracking and Abnormal Behavior Detection in Video Surveillance Using Optical Flow and Neural Networks. Advanced Information Networking and Applications Workshops (WAINA), 2014 28th International Conference on. :61-66.

An abnormal behavior detection algorithm for surveillance is required to correctly identify the targets as being in a normal or chaotic movement. A model is developed here for this purpose. The uniqueness of this algorithm is the use of foreground detection with Gaussian mixture (FGMM) model before passing the video frames to optical flow model using Lucas-Kanade approach. Information of horizontal and vertical displacements and directions associated with each pixel for object of interest is extracted. These features are then fed to feed forward neural network for classification and simulation. The study is being conducted on the real time videos and some synthesized videos. Accuracy of method has been calculated by using the performance parameters for Neural Networks. In comparison of plain optical flow with this model, improved results have been obtained without noise. Classes are correctly identified with an overall performance equal to 3.4e-02 with & error percentage of 2.5.